Let $(X, Y)$ be a bivariate random vector. The estimation of a probability of the form
$P(Y \le y \mid X > t)$ is challenging when $t$ is large, and a fruitful approach consists in
studying, if it exists, the limiting conditional distribution of the random vector $(X, Y)$,
suitably normalized, given that $X$ is large. There already exists a wide literature on bivariate
models for which this limiting distribution exists. In this paper, a statistical analysis of
this problem is done. Estimators of the limiting distribution (which is assumed to exist)
and the normalizing functions are provided, as well as an estimator of the conditional
quantile function when the conditioning event is extreme. Consistency of the estimators
is proved and a functional central limit theorem for the estimator of the limiting
distribution is obtained. The small sample behavior of the estimator of the conditional quantile
function is illustrated through simulations.

1 Introduction

Let $(X, Y)$ be a bivariate random vector for which the conditional distribution of $Y$ given
that $X > t$ is of interest, for values of $t$ such that the conditioning event is a rare event. This
happens for example when the possible contagion between two dependent market returns
$X$ and $Y$ is investigated, see e.g. Bradley and Taqqu (2004) or Abdous et al. (2008). The
estimation of a probability of the form $P(Y \le y \mid X > t)$ becomes challenging as soon as
$t$ is large, since the conditional empirical distribution becomes useless when no observations
are available. A fruitful alternative approach consists in studying, if it exists, the limiting
distribution of the random vector $(X, Y)$ conditionally on $X$ being large. This corresponds to
assuming that there exist functions $m$, $a$ and $\psi$, and a bivariate distribution function (cdf)
$F$ on $[0, \infty) \times (-\infty, \infty)$ with non degenerate marginal distributions, such that



\[ \lim_{t \to \infty} P(X \le t + \psi(t) x \,;\, Y \le m(t) + a(t) y \mid X > t) = F(x, y) \tag{1} \]



at all points of continuity of $F$. This approach was suggested by Heffernan and Tawn (2004)
and investigated by Heffernan and Resnick (2007). Models for which condition (1) holds have
already been investigated in many references. Eddy and Gale (1981) and Berman (1992)
proved that (1) holds for spherical distributions; bivariate elliptic distributions were
investigated by Abdous et al. (2005), multivariate elliptic distributions and related distributions
by Hashorva (2006) and Hashorva et al. (2007). The analysis of the underlying geometric
structure (ellipticity of the level sets of the densities) has led to various generalizations by Barbe
(2003) and Balkema and Embrechts (2007). See also Fougères and Soulier (2010) for a recent
review on the subject.

An important issue that still has to be addressed is the statistical estimation of the
functions $a$ and $m$ that appear in (1), as well as of the limiting distribution function $F$. This is
the aim of the present paper. Two problems are considered. The first one is the nonparametric
estimation of the limiting distribution and of the normalizing functions. This makes it possible,
for instance, to test for a specific limiting distribution, e.g. the standard Gaussian distribution
which appears in many examples. Since we are also interested in the case where the conditioning
event is beyond the range of observations, a semiparametric procedure will be defined
to allow this extrapolation. This can only be done under more restrictive assumptions, which
are satisfied by most models already investigated in the literature.

The paper is organized as follows. In Section 2, we rephrase (1) in terms of vague 
convergence of measures in order to use the point process techniques and the results of 
Heffernan and Resnick (2007). We also introduce moment assumptions which are needed 
to prove the consistency of the non parametric estimators introduced in Section 3. A func- 
tional central limit theorem is obtained under a second order condition. A specific analysis 
of the case of a limiting distribution with product form is done in Section 4. The functional 
central limit theorem is used to derive a goodness of fit test for the second marginal of the 
limiting distribution F. In Section 4.2, semi-parametric estimators that allow extrapolations 
beyond the range of the observations are studied and applied to the estimation of conditional 
quantiles when the conditioning event is extreme. A simulation study is given in Section 5, 
which illustrates the behavior of the goodness of fit test proposed in Section 4.1 and of the 
estimator of the conditional quantile proposed in Section 4.2. These results are applied in
Section 6 to some financial data. Section 7 collects the proofs. 

2 Assumptions and preliminary results 

We first rephrase the convergence (1) in terms of vague convergence of measures, in order 
to use point process techniques and the results of Heffernan and Resnick (2007). See also 
Das and Resnick (2008, 2009). Condition (1) implies that the marginal distribution of $X$
belongs to the domain of attraction of an extreme value distribution with index $\gamma \in \mathbb{R}$, i.e.
there exist normalizing sequences $\{a_n\}$ and $\{b_n\}$ with $a_n > 0$ such that
$P(\max_{1 \le i \le n} (X_i - b_n)/a_n \le x)$ converges to $\exp\{-P_\gamma(x)\}$ for each $x$ such that $1 + \gamma x > 0$, where
$P_\gamma(x) = (1 + \gamma x)^{-1/\gamma}$ if $\gamma \ne 0$ and $P_0(x) = e^{-x}$, and the random variables $X_i$ are independent copies
of $X$. For simplicity, we assume that $\gamma \ge 0$, and in the case $\gamma = 0$ we assume that the right
endpoint of the marginal distribution of $X$ is infinite.

Recall that a measure defined on the Borel sigma-field of a locally compact separable space
$E$ is called a Radon measure if it is finite on compact sets. A sequence of Radon measures
$\mu_n$ defined on $E$ converges vaguely to a Radon measure $\mu$ if $\int_E f(x) \, \mu_n(dx)$ converges to
$\int_E f(x) \, \mu(dx)$ for all continuous compactly supported functions $f$. See Resnick (1987, Chapter 3) or
Heffernan and Resnick (2007, Appendix A3). We will consider vague convergence of Radon
measures defined on the Borel sigma-fields of $(-1/\gamma, \infty]$ or $(-1/\gamma, \infty] \times [-\infty, \infty]$.

Assumption 1. There exist $\gamma \ge 0$, monotone functions $a$, $b$, $m$ and $\psi$ such that the marginal
distribution of $X$ is in the domain of attraction of the extreme value distribution with extreme
value index $\gamma$ and the sequence of measures $\nu_n$ defined by

\[ \nu_n(\cdot) = n \, P\left( \left( \frac{X - b(n)}{\psi \circ b(n)} , \frac{Y - m \circ b(n)}{a \circ b(n)} \right) \in \cdot \right) \]

converges vaguely on $(-1/\gamma, \infty] \times [-\infty, \infty]$ to a Radon measure $\nu$ such that
$\nu([0, \infty) \times (-\infty, \infty)) = 1$, the distribution function $y \mapsto \nu([0, \infty) \times (-\infty, y])$ is non degenerate and
the map $(x, y) \mapsto \nu([x, \infty) \times (-\infty, y])$ is continuous on $(-1/\gamma, \infty] \times [-\infty, \infty]$.

The link between Assumption 1 and Equation (1) is that the limiting distribution $F$ is
given, for all positive $x$ and real $y$, by

\[ F(x, y) = \nu([0, x] \times (-\infty, y]) . \]

Assumption 1 also implies that $F$ is continuous and that the sequence of probability
distribution functions $F_n$ defined, for all positive $x$ and real $y$, by

\[ F_n(x, y) = \nu_n([0, x] \times (-\infty, y]) \]

converges to $F$ locally uniformly. Assumption 1 can also be interpreted as the weak
convergence to $F$ of the vector $((X - b(n))/\psi \circ b(n), (Y - m \circ b(n))/a \circ b(n))$ conditionally on
$X > b(n)$, i.e. for all bounded continuous functions $h$ on $[0, \infty) \times (-\infty, \infty)$,



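\[ \lim_{n \to \infty} E\left[ h\left( \frac{X - b(n)}{\psi \circ b(n)} , \frac{Y - m \circ b(n)}{a \circ b(n)} \right) \,\Big|\, X > b(n) \right] = \int_0^\infty \int_{-\infty}^\infty h(x, y) \, F(dx, dy) . \tag{2} \]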



Remark 1. All results concerning only the marginal distribution of $X$ are obtained by applying
the usual extreme value theory. In particular, the functions $\psi$ and $b$ are determined by the
marginal distribution of $X$ only. The function $b$ can and will be chosen as $b = (1/(1 - F_X))^{\leftarrow}$,
where $F_X$ is the distribution function of $X$. The function $\psi$ satisfies

\[ \lim_{t \to \infty} \frac{\psi(t + \psi(t) u)}{\psi(t)} = 1 + \gamma u . \tag{3} \]

See Resnick (1987, Propositions 1.4 and 1.11). For any $x > -1/\gamma$, it holds that

\[ \nu([x, \infty] \times [-\infty, \infty]) = (1 + \gamma x)^{-1/\gamma} , \]

with the usual convention that this expression must be read as $e^{-x}$ when $\gamma = 0$.
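
As a quick illustration of Remark 1 (a worked example that is not part of the original text, with the standard exponential distribution chosen purely for convenience), the quantities $b$, $\psi$ and $\gamma$ can be written in closed form and the marginal limit checked by hand:

```latex
% Illustrative assumption: X is standard exponential, 1 - F_X(x) = e^{-x}, x >= 0.
\begin{align*}
b(t) &= \left( \frac{1}{1 - F_X} \right)^{\leftarrow}(t) = \log t , \qquad t \ge 1 , \\
\psi(t) &\equiv 1 , \qquad \gamma = 0 , \qquad
\frac{\psi(t + \psi(t) u)}{\psi(t)} = 1 = 1 + \gamma u , \\
n \, P\bigl( X > b(n) + \psi(b(n)) \, x \bigr)
  &= n \, e^{-\log n - x} = e^{-x} , \qquad x \ge -\log n .
\end{align*}
```

The last line agrees with $\nu([x, \infty] \times [-\infty, \infty]) = e^{-x}$, the $\gamma = 0$ reading of the formula above.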

Remark 2. Assumption 1 has few implications for the functions $a$ and $m$ and for the
distribution $\Psi$ defined by

\[ \Psi(z) = \int_0^\infty \int_{-\infty}^z \nu(dx, dy) . \]

If $Y$ is independent of $X$, then $\Psi$ is the distribution of $Y$, $a = 1$ and $m = 0$. Thus $\Psi$ can be
any probability distribution. In particular, it is not necessarily an extreme value distribution.

Remark 3. If the pair $(X, Y)$ satisfies Assumption 1, then so does any affine transformation
of $(X, Y)$. For instance, if $X$ and $Y$ have finite mean and variance, then
$((X - E[X])/\mathrm{var}^{1/2}(X), (Y - E[Y])/\mathrm{var}^{1/2}(Y))$ also satisfies Assumption 1. But non linear
transformations of $(X, Y)$ do not necessarily satisfy the assumption. In particular, the usual
(in extreme value theory) transformation of $X$ and $Y$ to random variables with prescribed
marginal distributions is not always possible, as investigated in Heffernan and Resnick (2007,
Section 7). It is never possible in the cases where the joint limiting distribution is a product
measure. Consequently, we do not make any specific assumption on the marginal distributions
of $X$ and $Y$.

Obviously, the functions $a$ and $m$ are defined up to asymptotic equivalence, i.e. if $m'$ and
$a'$ satisfy

\[ \lim_{x \to \infty} \frac{a'(x)}{a(x)} = 1 , \qquad \lim_{x \to \infty} \frac{m(x) - m'(x)}{a(x)} = 0 , \]

then the measure $\nu'_n$ defined as $\nu_n$ but with $a'$ and $m'$ instead of $a$ and $m$ converges vaguely
to the same limit measure $\nu$. Beyond this trivial remark, the following result summarizes
Heffernan and Resnick (2007, Propositions 1 and 2) and contains most of what can be inferred
from Assumption 1. Recall that a function $f$ defined on a neighborhood of infinity is said to
be regularly varying if there exists a constant $\alpha \in \mathbb{R}$ such that

\[ \lim_{x \to \infty} \frac{f(tx)}{f(x)} = t^\alpha \]

for all $t > 0$. If $\alpha = 0$, the function is called slowly varying.

Lemma 1. Under Assumption 1, there exists $\zeta \in \mathbb{R}$ such that the function $a \circ b$ is regularly
varying at infinity with index $\zeta$ and the function $m$ satisfies

\[ \lim_{t \to \infty} \frac{m \circ b(tx) - m \circ b(t)}{a \circ b(t)} = J_\zeta(x) , \]

with $J_\zeta(x) = (x^\zeta - 1)/\zeta$ if $\zeta \ne 0$ and $J_0(x) = c \log(x)$ for some $c \in \mathbb{R}$, and the convergence is
locally uniform on $(0, \infty)$.

For a sequence $(X_i, Y_i)$, $1 \le i \le n$, let $X_{(n:i)}$ denote the $i$-th order statistic and $Y_{[n:i]}$ denote
its concomitant, i.e. $X_{(n:1)}, \dots, X_{(n:n)}$ is the ordering of $X_1, \dots, X_n$ in increasing order, and
$Y_{[n:i]}$ is the $y$-variable corresponding to $X_{(n:i)}$.

Recall that an intermediate sequence is a sequence of integers $k_n$ such that
$\lim_{n \to \infty} k_n = \lim_{n \to \infty} n/k_n = \infty$. In accordance with common use and for the clarity of notation, the
dependence on $n$ will be implicit in the sequel.
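
The bookkeeping behind order statistics and concomitants can be sketched in a few lines of Python. This is only an illustration: the function names `order_statistics_with_concomitants` and `top_k` are hypothetical and do not come from the paper.

```python
def order_statistics_with_concomitants(xs, ys):
    """Sort the pairs by their x-value.

    Returns (x_sorted, y_conc) where x_sorted[i - 1] plays the role of
    X_(n:i) and y_conc[i - 1] the role of its concomitant Y_[n:i].
    """
    pairs = sorted(zip(xs, ys), key=lambda pair: pair[0])
    x_sorted = [x for x, _ in pairs]
    y_conc = [y for _, y in pairs]
    return x_sorted, y_conc


def top_k(xs, ys, k):
    """Return the threshold X_(n:n-k) and the k pairs
    (X_(n:n-i+1), Y_[n:n-i+1]), i = 1, ..., k, lying above it."""
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    x_sorted, y_conc = order_statistics_with_concomitants(xs, ys)
    threshold = x_sorted[n - k - 1]  # X_(n:n-k) in 1-based notation
    exceedances = [(x_sorted[n - i], y_conc[n - i]) for i in range(1, k + 1)]
    return threshold, exceedances
```

In practice `k` would be chosen as an intermediate sequence, i.e. growing with `n` but with `k / n` small.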

Define the random measure

\[ \tilde\nu_n = \frac{1}{k} \sum_{i=1}^n \delta_{(\{X_i - b(n/k)\}/\psi \circ b(n/k) ,\, \{Y_i - m \circ b(n/k)\}/a \circ b(n/k))} . \tag{4} \]

Applying Resnick (1986, Proposition 5.3) (see also Resnick (1987, Exercise 3.5.7)), we
straightforwardly obtain the following result.

Proposition 2. If Assumption 1 holds, then for any intermediate sequence $k$, $\tilde\nu_n$ converges
weakly to $\nu$ locally uniformly on $(-1/\gamma, \infty] \times [-\infty, \infty]$.

Consequently, $\tilde\nu_n([0, x] \times (-\infty, y])$ converges weakly locally uniformly to $F(x, y)$. But $\tilde\nu_n$ is
not an estimator, since it involves the unknown functions $a$ and $m$. In order to define
estimators of these functions, and of the distribution function $F$, we will need to prove convergence
of integrals of unbounded functions with respect to the random measure $\tilde\nu_n$. Therefore we
need to strengthen Assumption 1.

Assumption 2. There exist $p^* > 0$, $q^* > 0$ such that for any $\epsilon \in (0, 1/\gamma)$,

\[ \lim_{n \to \infty} \int_{-\epsilon}^\infty \int_{-\infty}^\infty |x|^{p^*} |y|^{q^*} \, \nu_n(dx, dy) = \int_{-\epsilon}^\infty \int_{-\infty}^\infty |x|^{p^*} |y|^{q^*} \, \nu(dx, dy) . \tag{5} \]

Condition (5) can be seen as a strengthening of (1) and (2) in order to obtain the
convergence of conditional moments. Under Assumption 2, for all $0 \le p \le p^*$ and $0 \le q \le q^*$, it
holds that

\[ \lim_{t \to \infty} \frac{E[(X - t)^p |Y - m(t)|^q \mid X > t]}{\psi^p(t) \, a^q(t)} = \int_0^\infty \int_{-\infty}^\infty x^p |y|^q \, \nu(dx, dy) . \tag{6} \]

For the reason mentioned in Remark 1, Assumption 1 implies the convergence (5) with
$q^* = 0$ and any $p^* < 1/\gamma$. In applications, it will be assumed that $q^* \ge 2$. The function $a$ and
the limiting measure $\nu$ are defined up to a change of scale, thus, without loss of generality,
we assume henceforth that

\[ \int_0^\infty \int_{-\infty}^\infty y^2 \, \nu(dx, dy) = \int_{-\infty}^\infty y^2 \, \Psi(dy) = 1 . \tag{7} \]

Proposition 3. If Assumptions 1 and 2 hold, then for any intermediate sequence $k$ and any
continuous function $g$ such that $|g(x, y)| \le C(|x| \vee 1)^{p^*}(|y| \vee 1)^{q^*}$, for any $\epsilon \in (0, 1/\gamma)$,

\[ \int_{-\epsilon}^\infty \int_{-\infty}^\infty g(x, y) \, \tilde\nu_n(dx, dy) \to_P \int_{-\epsilon}^\infty \int_{-\infty}^\infty g(x, y) \, \nu(dx, dy) . \tag{8} \]



For historical interest, we can also mention the following consequence of Assumption 1, 
first stated in Eddy and Gale (1981, Theorem 6.1) in a restricted case of spherical distribu- 
tions. 

Proposition 4. Under Assumption 1, {yj n:n ] — mo b(n/k)}/a o b(n/k) converges weakly to 
If moreover v is a product measure, then {Yu. n T — mob(n/k)}/aob(n/k) is asymptotically 
independent of Xt n . n y 

Let us finally mention that Davydov and Egorov (2000) obtained functional limit theorems 
for sums of concomitants corresponding to a number $k$ of order statistics such that $k/n \to 0$.
Their problem differs from ours. Their assumptions on the joint distribution of the random
pairs are much weaker than Assumption 1, but their results are of a very different nature and
it does not seem possible to use them to derive Propositions 2-3 for instance.

3 Nonparametric estimation of $\psi$, $a$, $m$ and $F$



In this section, we introduce nonparametric estimators of the functions $\psi$, $m$, $a$ and $F$ based
on i.i.d. observations $(X_1, Y_1), \dots, (X_n, Y_n)$ of a bivariate distribution which satisfies
Assumption 2.



3.1 Definitions and consistency 

In order to estimate nonparametrically the limiting distribution $F$, we first need
nonparametric estimators of the quantities $\psi(X_{(n:n-k)})$, $m(X_{(n:n-k)})$ and $a(X_{(n:n-k)})$, with $k$ an
intermediate sequence, i.e. such that $k \to \infty$ and $k/n \to 0$. The estimation of $\psi(X_{(n:n-k)})$
is a well known estimation issue, see e.g. De Haan and Ferreira (2006, Section 4.2). If the
extreme value index $\gamma$ of $X$ is less than 1, then $\psi$ can be estimated as the mean residual
life. Let $\hat\gamma$ be a consistent estimator of $\gamma$ (see e.g. De Haan and Ferreira (2006, Chapter 3)
or Beirlant et al. (2004, Chapter 5)) and define

\[ \hat\psi(X_{(n:n-k)}) = \frac{1 - \hat\gamma}{k} \sum_{i=1}^k \{X_{(n:n-i+1)} - X_{(n:n-k)}\} . \tag{9} \]

It follows straightforwardly from Proposition 3 that $\hat\psi(X_{(n:n-k)})/\psi \circ b(n/k) \to_P 1$. If it is
moreover assumed (as in Section 4 below) that $\gamma = 0$, then the above estimator can be
modified accordingly:

\[ \hat\psi(X_{(n:n-k)}) = \frac{1}{k} \sum_{i=1}^k \{X_{(n:n-i+1)} - X_{(n:n-k)}\} . \tag{10} \]
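
A minimal Python sketch of the mean-excess estimators (9) and (10) may help fix ideas. The function name `psi_hat` and the argument `gamma_hat` are illustrative choices, and the estimate of the extreme value index is assumed to be supplied by the user from any consistent estimator.

```python
def psi_hat(xs, k, gamma_hat=0.0):
    """Mean-excess estimate of psi at the random threshold X_(n:n-k).

    With gamma_hat = 0.0 this is the version (10); otherwise it is the
    version (9), with gamma_hat an externally supplied estimate.
    """
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    x_sorted = sorted(xs)
    threshold = x_sorted[n - k - 1]  # X_(n:n-k)
    # Average excess of the k largest observations over the threshold.
    mean_excess = sum(x_sorted[n - i] - threshold for i in range(1, k + 1)) / k
    return (1.0 - gamma_hat) * mean_excess
```

For instance, `psi_hat(sample_x, k=50)` would use the 50 largest observations of a hypothetical list `sample_x`.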



In order to estimate $m$, define

\[ \hat m(X_{(n:n-k)}) = \frac{\sum_{i=1}^k Y_{[n:n-i+1]} \{X_{(n:n-i+1)} - X_{(n:n-k)}\}}{\sum_{i=1}^k \{X_{(n:n-i+1)} - X_{(n:n-k)}\}} . \tag{11} \]

Proposition 5. If Assumption 1 holds and Assumption 2 holds with $p^* \ge 1$ and $q^* \ge 1$,
then, for any intermediate sequence $k$, it holds that

\[ \frac{\hat m(X_{(n:n-k)}) - m \circ b(n/k)}{a \circ b(n/k)} \to_P \mu , \]

where $\mu = (1 - \gamma) \int_0^\infty \int_{-\infty}^\infty x y \, \nu(dx, dy)$. If moreover $m(x) = \rho x$ and either $\mu = 0$ and
$a(x) = O(x)$ or $a(x) = o(x)$ then $\hat m(X_{(n:n-k)})/X_{(n:n-k)}$ is a consistent estimator of $\rho$.
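
The estimator (11) is a weighted average of the concomitants, with weights proportional to the excesses of the corresponding order statistics over the threshold. A short sketch, with the hypothetical name `m_hat`:

```python
def m_hat(xs, ys, k):
    """Excess-weighted average of the k top concomitants, as in (11)."""
    pairs = sorted(zip(xs, ys), key=lambda pair: pair[0])
    n = len(pairs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    threshold = pairs[n - k - 1][0]  # X_(n:n-k)
    top = pairs[n - k:]              # the k largest x-values with concomitants
    numerator = sum(y * (x - threshold) for x, y in top)
    denominator = sum(x - threshold for x, _ in top)
    if denominator == 0:
        raise ValueError("all exceedances are tied with the threshold")
    return numerator / denominator
```

Under the additional conditions of Proposition 5, dividing the result by the threshold would give an estimate of $\rho$.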

Remark 4. A sufficient condition for $\mu = 0$ is the symmetry of the measure $\nu$ with respect to
the second variable. This happens in particular if $\nu$ is a product measure, and the distribution
$\Psi$ is symmetric.

We now estimate $a(X_{(n:n-k)})$. Many estimators can be defined, each needing an ad hoc
moment assumption. The one we have chosen needs $q^* \ge 2$ in Assumption 2. Define

\[ \hat a(X_{(n:n-k)}) = \left\{ \frac{1}{k} \sum_{i=1}^k \{Y_{[n:n-i+1]} - \hat m(X_{(n:n-k)})\}^2 \right\}^{1/2} . \tag{12} \]

Proposition 6. If Assumption 1 holds and Assumption 2 holds with $p^* \ge 1$ and $q^* \ge 2$, and
if $\mu = 0$, then, for any intermediate sequence $k$, it holds that

\[ \hat a(X_{(n:n-k)})/a \circ b(n/k) \to_P 1 . \]

Remark 5. If $\mu \ne 0$, then $\hat a(X_{(n:n-k)})/a \circ b(n/k) \to_P \tau$, with

\[ \tau^2 = 1 - 2\mu \int_{-\infty}^\infty y \, \Psi(dy) + \mu^2 . \tag{13} \]
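
The scale estimator (12) is the root mean square deviation of the top concomitants from $\hat m$. The sketch below recomputes $\hat m$ internally so that it is self-contained; the name `a_hat` is illustrative.

```python
import math


def a_hat(xs, ys, k):
    """Root mean square deviation of the k top concomitants from m-hat, as in (12)."""
    pairs = sorted(zip(xs, ys), key=lambda pair: pair[0])
    n = len(pairs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    threshold = pairs[n - k - 1][0]
    top = pairs[n - k:]
    total_excess = sum(x - threshold for x, _ in top)
    if total_excess == 0:
        raise ValueError("all exceedances are tied with the threshold")
    # m-hat of (11): excess-weighted average of the concomitants.
    m = sum(y * (x - threshold) for x, y in top) / total_excess
    return math.sqrt(sum((y - m) ** 2 for _, y in top) / k)
```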

We can now consider the nonparametric estimator of the limiting joint distribution $F$.
Define

\[ \hat F(x, y) = \frac{1}{k} \sum_{i=1}^k 1_{\{X_{(n:n-i+1)} \le X_{(n:n-k)} + \hat\psi(X_{(n:n-k)}) x\}} \, 1_{\{Y_{[n:n-i+1]} \le \hat m(X_{(n:n-k)}) + \hat a(X_{(n:n-k)}) y\}} . \tag{14} \]

Denote $u_n = \hat\psi(X_{(n:n-k)})/\psi \circ b(n/k)$ and

\[ v_n = \frac{\hat a(X_{(n:n-k)})}{a \circ b(n/k)} , \qquad \xi_n = \frac{\hat m(X_{(n:n-k)}) - m \circ b(n/k)}{a \circ b(n/k)} . \tag{15} \]

Then, with $x_n = \{X_{(n:n-k)} - b(n/k)\}/\psi \circ b(n/k)$ (see (22) below),

\[ \hat F(x, y) = \tilde\nu_n([x_n, x_n + u_n x] \times (-\infty, \xi_n + v_n y]) . \]

Thus Propositions 2, 5 and 6 easily yield the consistency of $\hat F(x, y)$, as stated in the following
theorem.

Theorem 7. Under Assumptions 1 and 2 with $\gamma < 1$, $p^* \ge 1$ and $q^* \ge 2$, if $\mu = 0$, then for
any intermediate sequence $k$, $\hat F(x, y)$ converges weakly to $F(x, y)$.
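
Putting (9), (11), (12) and (14) together, a compact sketch of the estimator $\hat F$ could look as follows. The factory name `limiting_cdf_estimator` is hypothetical, and `gamma_hat` is again assumed to be provided externally.

```python
import math


def limiting_cdf_estimator(xs, ys, k, gamma_hat=0.0):
    """Return a function f_hat(x, y) implementing (14) from the k top pairs."""
    pairs = sorted(zip(xs, ys), key=lambda pair: pair[0])
    n = len(pairs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    threshold = pairs[n - k - 1][0]
    top = pairs[n - k:]
    excesses = [x - threshold for x, _ in top]
    total_excess = sum(excesses)
    if total_excess == 0:
        raise ValueError("all exceedances are tied with the threshold")
    psi = (1.0 - gamma_hat) * total_excess / k                           # (9)
    m = sum(y * e for (_, y), e in zip(top, excesses)) / total_excess    # (11)
    a = math.sqrt(sum((y - m) ** 2 for _, y in top) / k)                 # (12)

    def f_hat(x, y):
        hits = sum(
            1 for xi, yi in top
            if xi <= threshold + psi * x and yi <= m + a * y
        )
        return hits / k

    return f_hat
```

Calling `f_hat(float("inf"), z)` gives the empirical second marginal of the same estimator.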

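We can also define an estimator of the second marginal of $F$. Denote

\[ \hat\Psi(z) = \frac{1}{k} \sum_{i=1}^k 1_{\{Y_{[n:n-i+1]} \le \hat m(X_{(n:n-k)}) + \hat a(X_{(n:n-k)}) z\}} . \tag{16} \]

Then, under the assumptions of Theorem 7, $\hat\Psi$ also converges to $\Psi$. Note that if $\mu \ne 0$, then
$\hat\Psi(z)$ converges weakly to $\Psi(\mu + \tau z)$, with $\tau$ defined in (13).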

3.2 Central limit theorems 

In order to obtain central limit theorems, we need to strengthen Assumptions 1 and 2. 

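Assumption 3. There exist positive real numbers $p^\dagger$ and $q^\dagger$, a function $c$ such that $\lim_{t \to \infty} c(t) = 0$
and a Radon measure $\mu^\dagger$ on $(-1/\gamma, \infty) \times (-\infty, \infty)$ such that for any $\epsilon \in (0, 1/\gamma)$, and any
measurable function $h$ such that $|h(x, y)| \le (|x| \vee 1)^{p^\dagger}(|y| \vee 1)^{q^\dagger}$, it holds that

\[ \int_{-\epsilon}^\infty \int_{-\infty}^\infty |h(x, y)| \, \mu^\dagger(dx, dy) < \infty , \]

and

\[ \left| \int_{-\epsilon}^\infty \int_{-\infty}^\infty h(x, y) \, \nu_n(dx, dy) - \int_{-\epsilon}^\infty \int_{-\infty}^\infty h(x, y) \, \nu(dx, dy) \right| \le c \circ b(n) \int_{-\epsilon}^\infty \int_{-\infty}^\infty |h(x, y)| \, \mu^\dagger(dx, dy) . \tag{17} \]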

Remark 6. Taking $h = 1_{[0,x] \times (-\infty, y]}$, (17) yields

\[ |F_n(x, y) - F(x, y)| \le c \circ b(n) \, \mu^\dagger([0, x] \times (-\infty, y]) \tag{18} \]

where $F_n(x, y) = \nu_n([0, x] \times (-\infty, y])$. This is a classical second order condition (see e.g.
de Haan and Resnick (1993, Condition 4.1)), which gives a non uniform rate of convergence 
in Condition (1). The condition (17) is stronger than (18) in the sense that it moreover gives 
a rate of convergence for conditional moments. 

For a sequence $k$ depending on $n$, define the random measure $\tilde\mu_n$ by

\[ \tilde\mu_n = k^{1/2} (\tilde\nu_n - \nu) \]

and denote

\[ W_n(x, y) = \tilde\mu_n((x, \infty) \times (-\infty, y]) . \]

The next result states the functional convergence of $W_n$ in the space $\mathcal{D}((-1/\gamma, \infty) \times (-\infty, \infty))$
of right-continuous and left-limited functions, endowed with Skorohod's $J_1$ topology.

Proposition 8. If Assumption 3 holds with $p^\dagger \ge 2$ and $q^\dagger \ge 4$ and if the sequence $k$ is chosen
such that

\[ \lim_{n \to \infty} k^{1/2} \, c \circ b(n/k) = 0 , \tag{19} \]

then $k$ is an intermediate sequence and the sequence of processes $W_n$ converges weakly in
$\mathcal{D}((-1/\gamma, \infty) \times (-\infty, \infty))$ to a Gaussian process $W$ with autocovariance function

\[ \mathrm{cov}(W(x, y), W(x', y')) = \nu([x \vee x', +\infty] \times [-\infty, y \wedge y']) . \tag{20} \]

Moreover, the sequence of random measures $\tilde\mu_n$ converges weakly (in the sense of finite
dimensional distributions) to an independently scattered Gaussian random measure $W$ with control
measure $\nu$ on the space of measurable functions $g$ such that $|g(x, y)|^2 \le C(x \vee 1)^{p^\dagger}(|y| \vee 1)^{q^\dagger}$,
i.e. $W(g)$ is a centered Gaussian random variable with variance

\[ \int_{-1/\gamma}^\infty \int_{-\infty}^\infty g^2(s, t) \, \nu(ds, dt) \]

and $W(g)$, $W(h)$ are independent if $\int g h \, d\nu = 0$.

The proof is in Section 7. Applying Proposition 8, we easily obtain the following corollary.
For $i, j \ge 0$, denote $g_{i,j}(x, y) = x^i y^j 1_{\{x > 0\}}$.

Corollary 9. Under the assumptions of Proposition 8 and if moreover $\mu = 0$, then

\[ k^{1/2} \left( \frac{X_{(n:n-k)} - b(n/k)}{\psi \circ b(n/k)} , \frac{\hat m(X_{(n:n-k)}) - m \circ b(n/k)}{a \circ b(n/k)} , \frac{\hat a(X_{(n:n-k)})}{a \circ b(n/k)} - 1 \right) \]

converges jointly with $k^{1/2}(\tilde\nu_n - \nu)$ to a Gaussian vector which can be expressed as

\[ \left( W(g_{0,0}) , (1 - \gamma) W(g_{1,1}) , \tfrac{1}{2} W(g_{0,2}) \right) . \]

Proposition 8 and Corollary 9 straightforwardly yield a functional central limit theorem
for the estimator $\hat\Psi$ of $\Psi$ defined in (16). Recall that $F(x, y) = \nu([0, x] \times (-\infty, y])$.

Theorem 10. If Assumption 3 holds with $p^\dagger \ge 2$ and $q^\dagger \ge 4$, if $\mu = 0$, if $F$ (and hence $\Psi$)
is differentiable and if the intermediate sequence $k$ satisfies (19), then $k^{1/2}(\hat\Psi - \Psi)$ converges
in $\mathcal{D}((-\infty, +\infty))$ to the process $M$ defined by

\[ M(y) = W(0, y) - \frac{\partial F}{\partial x}(0, y) W(g_{0,0}) + \Psi'(y) \left\{ (1 - \gamma) W(g_{1,1}) + \tfrac{1}{2} y \, W(g_{0,2}) \right\} . \tag{21} \]

We prove Theorem 10 here in order to explain the last two terms in the right hand side 
of (21). 



Proof of Theorem 10. Recall the definitions of $v_n$ and $\xi_n$ in (15) and define

\[ x_n = \frac{X_{(n:n-k)} - b(n/k)}{\psi \circ b(n/k)} . \tag{22} \]

Then

\begin{align*}
k^{1/2} \{\hat\Psi(y) - \Psi(y)\} &= k^{1/2} \{\tilde\nu_n([x_n, \infty) \times (-\infty, \xi_n + v_n y]) - \Psi(y)\} \\
&= \tilde\mu_n([x_n, \infty) \times (-\infty, \xi_n + v_n y]) \tag{23} \\
&\quad + k^{1/2} \{\nu([x_n, \infty) \times (-\infty, \xi_n + v_n y]) - \Psi(y)\} . \tag{24}
\end{align*}

By Proposition 8, the term in (23) converges weakly to $W(0, y)$. By Corollary 9 and the delta
method, the term in (24) converges weakly to

\[ -\frac{\partial F}{\partial x}(0, y) W(g_{0,0}) + \Psi'(y) \left\{ (1 - \gamma) W(g_{1,1}) + \tfrac{1}{2} W(g_{0,2}) \, y \right\} . \]

□ 



4 Case of a product measure 

In this section, guided by examples (see e.g. Fougères and Soulier (2010)), we make the
following additional assumption. 

Assumption 4. The function $\psi$ is an auxiliary function satisfying $\lim_{x \to \infty} \psi(x)/x = 0$, there
exists $\rho \in \mathbb{R}$ such that $m(x) = \rho x$ and the measure $\nu$ is of the form

\[ \nu([x, \infty] \times (-\infty, y]) = e^{-x} \Psi(y) , \tag{25} \]

where $\Psi$ is a distribution function on $\mathbb{R}$.

The assumption $m(x) = \rho x$ is satisfied by most known examples. Cf. Fougères and Soulier
(2010) for a review of models satisfying these assumptions. It is possible to have $m(x) = \rho x$
even when $\nu$ is not a product measure, as in the case of elliptical distributions with regularly
varying tails, cf. Abdous et al. (2005).

The condition $\lim_{x \to \infty} \psi(x)/x = 0$ implies that the extreme value index of $X$ is 0 (cf.
Resnick (1987, Lemma 1.2)). We now recall the necessary and sufficient condition for $\nu$ to be
a product measure proved by Heffernan and Resnick (2007, Proposition 2).

Lemma 11. The measure $\nu$ is a product measure if and only if $a \circ b$ is slowly varying at
infinity and

\[ \lim_{t \to \infty} \frac{m \circ b(tx) - m \circ b(t)}{a \circ b(t)} = 0 . \tag{26} \]

The main consequence of Assumption 4 and of Lemma 11 is that

\[ \psi(x) = o(a(x)) , \]

(by application of De Haan and Ferreira (2006, Theorem B.2.21)) and this implies that given
$X > t$, $(X - t)/a(t)$ converges in probability to zero. We thus have the following corollary.

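Corollary 12. If Assumptions 1 and 4 hold then, for all $x \ge 0$ and $y \in (-\infty, \infty)$,

\[ \lim_{t \to \infty} P(X \le t + \psi(t) x \,,\, Y - \rho X \le a(t) y \mid X > t) = (1 - e^{-x}) \Psi(y) . \]

Define the measure $\nu_n^\ddagger$ on $(-1/\gamma, +\infty) \times [-\infty, +\infty]$ by

\[ \nu_n^\ddagger(\cdot) = n \, P\left( \left( \frac{X - b(n)}{\psi \circ b(n)} , \frac{Y - \rho X}{a \circ b(n)} \right) \in \cdot \right) . \tag{27} \]

Then $\nu_n^\ddagger$ converges vaguely on $(-\infty, +\infty] \times [-\infty, +\infty]$ to $\nu$.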



4.1 Nonparametric estimation 



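Under Assumption 4, we can define new estimators of $\rho$, $a$ and the marginal distribution $\Psi$
as follows:

\begin{align}
\hat\rho &= \frac{\sum_{i=1}^k Y_{[n:n-i+1]} \{X_{(n:n-i+1)} - X_{(n:n-k)}\}}{\sum_{i=1}^k X_{(n:n-i+1)} \{X_{(n:n-i+1)} - X_{(n:n-k)}\}} , \tag{28} \\
\hat a(X_{(n:n-k)}) &= \left\{ \frac{1}{k} \sum_{i=1}^k \{Y_{[n:n-i+1]} - \hat\rho X_{(n:n-i+1)}\}^2 \right\}^{1/2} , \tag{29} \\
\hat\Psi(z) &= \frac{1}{k} \sum_{i=1}^k 1_{\{Y_{[n:n-i+1]} \le \hat\rho X_{(n:n-i+1)} + \hat a(X_{(n:n-k)}) z\}} . \tag{30}
\end{align}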



Theorem 13. If Assumptions 1, 2 (with $p^* = 1$ and $q^* = 2$) and 4 hold and if $\mu = 0$, then for
any intermediate sequence $k$, $b(n/k)(\hat\rho - \rho)/a \circ b(n/k)$ converges weakly to 0,
$\hat a(X_{(n:n-k)})/a \circ b(n/k)$ converges weakly to 1 and $\hat\Psi$ is a consistent estimator of $\Psi$. If moreover $a(x) = o(x)$
then $\hat\rho$ converges weakly to $\rho$.

The proof of Theorem 13 is along the lines of the proof of Propositions 5, 6 and Theorem 7.
The only difference is that instead of the random measure $\tilde\nu_n$ defined in (4) we use the measure
$\tilde\nu_n^\ddagger$, defined by

\[ \tilde\nu_n^\ddagger = \frac{1}{k} \sum_{i=1}^n \delta_{(\{X_i - b(n/k)\}/\psi \circ b(n/k) ,\, \{Y_i - \rho X_i\}/a \circ b(n/k))} , \tag{31} \]

which converges weakly to the measure $\nu$ for any intermediate sequence $k$, as a consequence
of Corollary 12 and Resnick (1986, Proposition 5.3). The details are omitted.
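
The estimators of Section 4.1 use a random centering $\hat\rho X$ instead of the deterministic centering $\hat m$. The sketch below follows that description (slope from an excess-weighted regression through the origin, scale from the residuals, empirical distribution of the standardized residuals); the function name `product_case_estimators` is hypothetical, and the exact form of the scale estimator is an assumption inferred from the random centering used in (31).

```python
import math


def product_case_estimators(xs, ys, k):
    """Return (rho_hat, a_hat, psi_cdf_hat) computed from the k top pairs."""
    pairs = sorted(zip(xs, ys), key=lambda pair: pair[0])
    n = len(pairs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    threshold = pairs[n - k - 1][0]  # X_(n:n-k)
    top = pairs[n - k:]
    numerator = sum(y * (x - threshold) for x, y in top)
    denominator = sum(x * (x - threshold) for x, _ in top)
    if denominator == 0:
        raise ValueError("degenerate exceedances")
    rho = numerator / denominator
    # Residuals after the random centering rho_hat * X.
    residuals = [y - rho * x for x, y in top]
    a = math.sqrt(sum(r * r for r in residuals) / k)

    def psi_cdf_hat(z):
        return sum(1 for r in residuals if r <= a * z) / k

    return rho, a, psi_cdf_hat
```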

In order to prove central limit theorems, we now introduce a second order assumption
which is a modification of Assumption 3 that accounts for the random centering. Recall the
measure $\nu_n^\ddagger$ defined in (27).

Assumption 5. There exist positive real numbers $p^\ddagger$ and $q^\ddagger$, a function $\tilde c$ such that $\lim_{t \to \infty} \tilde c(t) = 0$
and a Radon measure $\mu^\ddagger$ on $(-1/\gamma, \infty) \times (-\infty, \infty)$ such that for any $\epsilon \in (0, 1/\gamma)$, and any
measurable function $h$ such that $|h(x, y)| \le (|x| \vee 1)^{p^\ddagger}(|y| \vee 1)^{q^\ddagger}$, it holds that

\[ \int_{-\epsilon}^\infty \int_{-\infty}^\infty |h(x, y)| \, \mu^\ddagger(dx, dy) < \infty , \]

and

\[ \left| \int_{-\epsilon}^\infty \int_{-\infty}^\infty h(x, y) \, \nu_n^\ddagger(dx, dy) - \int_{-\epsilon}^\infty \int_{-\infty}^\infty h(x, y) \, \nu(dx, dy) \right| \le \tilde c \circ b(n) \int_{-\epsilon}^\infty \int_{-\infty}^\infty |h(x, y)| \, \mu^\ddagger(dx, dy) . \tag{32} \]

The difference with Assumption 3 is the presence of the measure $\nu_n^\ddagger$ instead of $\nu_n$. It can be
shown that Assumptions 3 and 4 with a smoothness assumption on $\Psi$ imply Assumption 5,
but with the same rate function $c$ as in Assumption 3, whereas in some cases Assumption 5
can be proved directly with a function $\tilde c$ which goes to zero at infinity faster than $c$. The
following results could be stated under Assumption 3, but the interest of Assumption 5 is to
take into account the possibility of faster rates of convergence of the estimators than those
allowed by Assumption 3.

As an example, consider the case of a bivariate Gaussian vector with standard marginals
and correlation $\rho$. Abdous et al. (2005) have shown that $\lim_{x \to \infty} P(Y \le \rho x + \sqrt{1 - \rho^2} \, y \mid
X > x) = \Phi(y)$ (where $\Phi$ is the distribution function of the standard Gaussian law), and
a rate of convergence of order $x^{-1}$ has been proved in Abdous et al. (2008). But of course,
since $(Y - \rho X)/\sqrt{1 - \rho^2}$ is standard Gaussian and independent of $X$, for all $x$ it holds that
$P(Y \le \rho X + \sqrt{1 - \rho^2} \, y \mid X > x) = \Phi(y)$. For general elliptical bivariate random vectors, it is
also proved in Abdous et al. (2008) that the rate of convergence with random centering can
be the square of the rate with deterministic centering. Assumption 5 can also be checked for
the generalized elliptical distributions studied in Fougères and Soulier (2010).
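
The contrast between random and deterministic centering in the Gaussian example can be checked by a small Monte Carlo experiment. The following sketch is only an illustration: the function name, the sample size and the seed are arbitrary choices, and the experiment uses the Python standard library only.

```python
import math
import random
from statistics import NormalDist


def conditional_probs(rho, x, y, n_samples=100_000, seed=0):
    """Monte Carlo estimates, for a standard bivariate Gaussian pair, of
    P(Y <= rho*X + s*y | X > x)  (random centering) and
    P(Y <= rho*x + s*y | X > x)  (deterministic centering),
    where s = sqrt(1 - rho^2)."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    kept = random_hits = fixed_hits = 0
    for _ in range(n_samples):
        big_x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        if big_x <= x:
            continue  # keep only the samples with X > x
        big_y = rho * big_x + scale * noise
        kept += 1
        random_hits += big_y <= rho * big_x + scale * y
        fixed_hits += big_y <= rho * x + scale * y
    if kept == 0:
        raise ValueError("no sample exceeded the threshold")
    return random_hits / kept, fixed_hits / kept
```

With `rho = 0.5`, `x = 1.5` and `y = 0.0`, the first returned value should be close to `NormalDist().cdf(0.0)`, while the second is visibly smaller, which reflects the slower convergence under deterministic centering.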

We can now state central limit theorems for $\hat a(X_{(n:n-k)})$, $\hat\rho$ and $\hat\Psi$ which parallel
Corollary 9 and Theorem 10. The proof is also omitted.

Theorem 14. If Assumptions 1, 4 and 5 hold with $p^\ddagger \ge 2$ and $q^\ddagger \ge 4$, if $\Psi$ is differentiable
and if $\mu = 0$ and if the intermediate sequence $k$ is chosen such that

\[ \lim_{n \to \infty} k^{1/2} \, \tilde c \circ b(n/k) = 0 , \tag{33} \]

then $k^{1/2}(\hat\Psi - \Psi)$ converges weakly in $\mathcal{D}((-\infty, \infty))$ to the process $M$ defined in (21) and

\[ k^{1/2} \left( \frac{b(n/k)(\hat\rho - \rho)}{a \circ b(n/k)} , \frac{\hat a(X_{(n:n-k)})}{a \circ b(n/k)} - 1 \right) \]

converges jointly with $k^{1/2}(\hat\Psi - \Psi)$ to the Gaussian vector $(W(g_{1,1}), \tfrac{1}{2} W(g_{0,2}))$.

Remark 7. As mentioned above, if we only assume Assumption 3 instead of Assumption 5
and (33) with $c$ instead of $\tilde c$ then the conclusion of the theorem still holds.



Kolmogorov-Smirnov Test 

In the case $\gamma = 0$ and when the limiting measure $\nu$ has product form, then $\frac{\partial F}{\partial x}(0, y) = \Psi(y)$.
Define $B(t) = W(0, \Psi^{-1}(t))$. Then $B$ is a standard Brownian motion on $[0, 1]$ and

\[ W(0, y) - \frac{\partial F}{\partial x}(0, y) W(g_{0,0}) = B \circ \Psi(y) - \Psi(y) B(1) = \bar B \circ \Psi(y) \]

where $\bar B$ is a standard Brownian bridge. By the same change of variable, $W(g_{0,2})$ can be
represented as

\[ V = \int_0^1 \{\Psi^{-1}(t)\}^2 \, dB(t) . \]

Since $\mu = 0$ and $\int y^2 \, \Psi(dy) = 1$, it is easily seen that

\begin{align*}
\mathrm{var}(W(g_{1,1})) &= 2 , \qquad \mathrm{cov}(W(g_{0,0}), W(g_{1,1})) = 0 , \\
\mathrm{cov}(W(0, y), W(g_{1,1})) &= \int_{-\infty}^y z \, \Psi(dz) = \int_0^{\Psi(y)} \Psi^{-1}(u) \, du , \\
\mathrm{cov}(W(g_{0,2}), W(g_{1,1})) &= \int_{-\infty}^\infty z^3 \, \Psi(dz) = \int_0^1 \{\Psi^{-1}(u)\}^3 \, du .
\end{align*}

Thus, $W(g_{1,1})$ can be represented as

\[ U = \int_0^1 \Psi^{-1}(s) \, dB(s) + N , \]

where $N$ is a standard Gaussian random variable independent of the Brownian motion $B$.
Since all random variables involved are jointly Gaussian, this shows that $M(y)$ has the same
distribution as

\[ \bar B \circ \Psi(y) + \Psi'(y) \{U + \tfrac{1}{2} y V\} . \]

Finally, since $\Psi$ is continuous, $\sup_{y \in \mathbb{R}} |M(y)|$ has the same distribution as

\[ Z = \sup_{t \in [0,1]} \left| \bar B(t) + \Psi' \circ \Psi^{-1}(t) \{U + \tfrac{1}{2} \Psi^{-1}(t) V\} \right| . \tag{34} \]

The extra terms come from the estimation of the functions $a$ and $m$. If they were known,
the limiting distribution would be the Brownian bridge as expected. Nevertheless, this
distribution depends only on $\Psi$, so it can be used for a goodness-of-fit test. See Section 5.1 for
a numerical illustration.
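
Since the law of $Z$ in (34) is not standard, critical values for the test have to be approximated numerically. The following is a rough Monte Carlo sketch under the assumption that $\Psi$ is the standard Gaussian distribution function; the grid discretization, the midpoint rule for the stochastic integrals and the function names are illustrative choices, not the procedure used in the paper.

```python
import math
import random
from statistics import NormalDist


def simulate_z(n_grid=200, rng=None):
    """One approximate draw of Z in (34), with Psi the standard Gaussian cdf."""
    rng = rng or random.Random()
    std = NormalDist()
    dt = 1.0 / n_grid
    # Brownian increments on a regular grid of [0, 1].
    incs = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_grid)]
    # Midpoint rule for the stochastic integrals defining U and V.
    q_mid = [std.inv_cdf((i + 0.5) * dt) for i in range(n_grid)]
    n_extra = rng.gauss(0.0, 1.0)  # the independent Gaussian variable N
    u = sum(q * d for q, d in zip(q_mid, incs)) + n_extra
    v = sum(q * q * d for q, d in zip(q_mid, incs))
    b_one = sum(incs)  # B(1)
    sup_value, b_t = 0.0, 0.0
    for i in range(1, n_grid):  # interior grid points t = i / n_grid
        b_t += incs[i - 1]
        t = i * dt
        q = std.inv_cdf(t)
        bridge = b_t - t * b_one            # Brownian bridge at time t
        drift = std.pdf(q) * (u + 0.5 * q * v)
        sup_value = max(sup_value, abs(bridge + drift))
    return sup_value


def critical_value(level=0.95, n_rep=1000, n_grid=200, seed=0):
    """Empirical quantile of simulated draws of Z."""
    rng = random.Random(seed)
    draws = sorted(simulate_z(n_grid, rng) for _ in range(n_rep))
    index = max(0, min(n_rep - 1, int(math.ceil(level * n_rep)) - 1))
    return draws[index]
```

A test at level 5% would then reject when $k^{1/2} \sup_y |\hat\Psi(y) - \Psi(y)|$ exceeds `critical_value(0.95)`; finer grids and more replications give more stable approximations.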

4.2 Semi-parametric estimation 

Two problems arise in practice: the estimation of the conditional probability $\theta(x, y) = P(Y \le
y \mid X > x)$ and of the conditional quantile $y = \theta^{\leftarrow}(x, p)$ for some fixed $p \in (0, 1)$ and for some
extreme $x$, i.e. beyond the range of the observations.

If $x$ lies within the range of the observations, then $\theta(x, y)$ can be estimated empirically by

\[ \hat\theta_{\mathrm{emp}}(x, y) = \frac{1}{k} \sum_{i=1}^n 1_{\{Y_i \le y\}} 1_{\{X_i > x\}} , \]

for $x = X_{(n:n-k)}$. The most interesting situation for using the limit distributions that arise
in Assumption 1 is when $x$ is outside the range of the observations, so that an empirical
estimate is no longer available. In such a situation, a semi-parametric approach will be
needed to extrapolate the functions $a(x)$, $m(x)$ and $\psi(x)$ for values $x$ beyond $X_{(n:n)}$. This
requires some modeling restrictions. We still assume that Assumption 4 holds and we assume
moreover that there exists $\sigma > 0$ such that

\[ a(x) = \sigma \sqrt{x \psi(x)} . \tag{35} \]

We will also assume that the limiting distribution function $\Psi$ in (25) is known. These
assumptions hold in particular for bivariate elliptical distributions, see Abdous et al. (2008). There,
and in many other examples, $\Psi$ is the distribution function of the standard Gaussian law.
See also Fougères and Soulier (2010). Assumption 4 and (35) imply that

\[ \lim_{x \to \infty} \theta(x, \rho x + \sigma \sqrt{x \psi(x)} \, z) = \Psi(z) , \tag{36} \]

so that $\theta(x, y)$ can be approximated for $x$ large enough by

\[ \Psi\left( \frac{y - \rho x}{\sigma \sqrt{x \psi(x)}} \right) . \]

Thus, in order to estimate $\theta$, we need a semi-parametric estimator of $\psi$. For this purpose, we
make the following assumption on the marginal distribution of $X$.

Assumption 6. The distribution function $H$ of $X$ satisfies

\[ 1 - H(x) = \exp\{-x^\beta \{c + O(x^\eta)\}\} \]

with $\beta > 0$ and $\eta < 0$.

Under Assumption 6, an admissible auxiliary function is given by

\[ \psi(x) = \frac{1}{c \beta} x^{1 - \beta} . \tag{37} \]

Under (35), the normalizing function $a$ is then

\[ a(x) = \frac{\sigma}{\sqrt{c \beta}} x^{1 - \beta/2} . \]

Let $k$ and $k_1$ be intermediate sequences. For the sake of clarity, in the sequel, we make explicit
the dependence of the estimators with respect to $k$ or $k_1$. Semi-parametric estimators of $\beta$
and $a(x)$ are given by

\begin{align}
\hat\beta_k &= \frac{\sum_{i=1}^k \{\log\log(n/i) - \log\log(n/k)\}}{\sum_{i=1}^k \{\log(X_{(n:n-i+1)}) - \log(X_{(n:n-k)})\}} , \tag{38} \\
\hat a_{k_1}(x) &= \hat a_{k_1}(X_{(n:n-k_1)}) \left( \frac{x}{X_{(n:n-k_1)}} \right)^{1 - \hat\beta_k/2} , \tag{39}
\end{align}

where $\hat a_{k_1}(X_{(n:n-k_1)})$ is the nonparametric estimator defined in (29).
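
The two semi-parametric estimators can be sketched as follows. The names `beta_hat` and `extrapolate_a` are hypothetical; `a_at_threshold` stands for the value of the nonparametric scale estimator at the threshold, which is assumed to have been computed separately.

```python
import math


def beta_hat(xs, k):
    """Estimator (38) of the Weibull-type tail parameter beta.

    Requires the k + 1 largest observations to be positive.
    """
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    x_sorted = sorted(xs)
    threshold = x_sorted[n - k - 1]  # X_(n:n-k)
    if threshold <= 0:
        raise ValueError("the threshold must be positive to take logarithms")
    log_log_nk = math.log(math.log(n / k))
    numerator = sum(math.log(math.log(n / i)) - log_log_nk for i in range(1, k + 1))
    denominator = sum(
        math.log(x_sorted[n - i]) - math.log(threshold) for i in range(1, k + 1)
    )
    if denominator == 0:
        raise ValueError("all exceedances are tied with the threshold")
    return numerator / denominator


def extrapolate_a(a_at_threshold, threshold, x, beta):
    """Extrapolation (39) of the scale function from the threshold to x."""
    return a_at_threshold * (x / threshold) ** (1.0 - beta / 2.0)
```

In line with Proposition 15 below, `k` for `beta_hat` would typically be taken much smaller than the `k_1` used for the nonparametric scale estimate.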

Proposition 15. If Assumption 6 holds, and if $k$ is an intermediate sequence such that

\[ \lim_{n \to \infty} \log(k)/\log(n) = \lim_{n \to \infty} k \log^{2\eta}(n) = 0 , \tag{40} \]

then $k^{1/2}(\hat\beta_k - \beta)$ converges weakly to the centered Gaussian distribution with variance $\beta^2$.
Suppose moreover that Assumptions 1, 4 and 5 hold with $p^\ddagger = 2$ and $q^\ddagger = 4$ and that $\mu = 0$
and (35) holds. Let $(x_n)$ be a sequence and $k_1$ be an intermediate sequence such that

\begin{align}
\lim_{n \to \infty} k_1^{1/2} \, \tilde c \circ b(n/k_1) &= 0 , \tag{41} \\
\lim_{n \to \infty} k/k_1 &= 0 , \tag{42} \\
\lim_{n \to \infty} \log(b(n/k_1))/\log(x_n) &= 1 , \tag{43} \\
\lim_{n \to \infty} k^{-1/2} \log(x_n) &= 0 . \tag{44}
\end{align}

Then $\dfrac{k^{1/2}}{\log(x_n)} \left\{ \dfrac{\hat a_{k_1}(x_n)}{a(x_n)} - 1 \right\}$ converges weakly to the centered Gaussian distribution with
variance $\beta^{-2}$.

Remark 8. By the arguments following Assumption 5, it can be seen that the conclusion of
Proposition 15 still holds if Assumption 5 is replaced by Assumption 3 and $\tilde c$ is replaced by
$c$ in (41).

The previous results lead to natural estimators of the conditional probability 6(x,y) = 
< V I X > x) and of the conditional quantile y = 9 ( ~(x,p). Define 

e(x,y) = ^(l^fO . (45) 

Under Assumptions 1, 2, 4 and (35), Theorem 13 implies that for fixed $x$ and $y$, $\hat\theta(x,y)$ is a consistent estimator of $\Psi((y - \rho x)/a(x))$, but a biased estimator of $\theta(x,y)$. The remaining bias, which is an approximation error due to the asymptotic nature of equation (36), can be bounded thanks to the second order Assumption 5. For more details, see Abdous et al. (2008, Section 3.2) for a treatment in the elliptical case.

We now investigate more thoroughly the estimation of the conditional quantile $y_n = \theta^{\leftarrow}(x_n,p)$ for some fixed $p \in (0,1)$ and some extreme sequence $x_n$, i.e. beyond the range of the observations, or equivalently, $x_n > b(n)$. An estimator $\hat y_n$ is defined by

$$\hat y_n = \hat\rho_{k_1} x_n + \hat a_{k_1}(x_n) \Psi^{-1}(p) , \tag{46}$$

where $\hat\rho_{k_1}$ is the nonparametric estimator defined in (28).
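As a small illustration of (46), the sketch below evaluates the plug-in quantile once estimates of $\rho$ and $a(x_n)$ are available. The function name and the default choice $\Psi = \Phi$ (the Gaussian case considered in Section 5) are assumptions made for this example; any other quantile function can be passed through `psi_inv`.

```python
from statistics import NormalDist

def conditional_quantile(rho_hat, a_hat_x, x, p, psi_inv=NormalDist().inv_cdf):
    """Plug-in estimate rho_hat * x + a_hat(x) * Psi^{-1}(p), following (46).

    rho_hat and a_hat_x are assumed to come from estimators (28) and (39);
    psi_inv defaults to the standard Gaussian quantile function (assumption).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    return rho_hat * x + a_hat_x * psi_inv(p)

# Illustrative values only (not taken from the paper).
median = conditional_quantile(0.5, 2.0, 10.0, 0.5)
upper = conditional_quantile(0.5, 2.0, 10.0, 0.975)
```

For p = 0.5 the Gaussian quantile is zero, so the estimate reduces to the linear term `rho_hat * x`.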

Corollary 16. Let the assumptions of Proposition 15 hold with Assumption 3 instead of 
Assumption 5 and c instead of c in (41), $\Psi' \circ \Psi^{-1}(p) > 0$ and

lim *wy = lim twM = x. 

n— >oo b(n) n— >oo x n 

(i) If $\Psi^{-1}(p) \ne 0$, then

$$\frac{k^{1/2} x_n}{\log(x_n)\, a(x_n)} \left\{ \frac{\hat y_n}{y_n} - 1 \right\}$$

converges weakly to a centered Gaussian law with variance $\{\Psi^{-1}(p)/(\rho\beta)\}^2$.
(ii) If $\Psi^{-1}(p) = 0$, then

i 1/2 r . 
™1 "^n I 2/n 



— - 1 

converges weakly to a centered Gaussian law with variance 2. 



5 Numerical Illustration 



In this section, we perform a small sample simulation study with two purposes. We analyze 
the behavior of the Kolmogorov-Smirnov test proposed in Section 4.1 and we illustrate the 
behavior of the estimator of the conditional quantile proposed in Section 4.2. 



5.1 Goodness-of-fit test for the distribution $\Psi$



Assume that the hypotheses of Section 4 hold, so that the nonparametric estimation procedure 
described in Section 4.1 can be used. Three types of distributions are considered, each of them 
restricted to the positive quadrant for convenience. These distributions are: 

(a) the elliptical distribution with radial survival function P(R > t) = e - ', and Pearson 
correlation coefficient p = 0.5 ; 

(b) the distribution with radial representation $R(\cos[(\pi/2 + \arcsin\rho)T - \arcsin\rho], \sin[(\pi/2 + \arcsin\rho)T])$, where $P(R > t) = e^{-t^2/2}$, $T$ has a non uniform concave density function $f_T(t) = 4/\{\pi + \pi(2t-1)^2\}$ on $[0,1]$, and $\rho = 0.5$;

(c) the distribution with radial representation $R(\cos[(\pi/2 + \arcsin\rho)T - \arcsin\rho], \sin[(\pi/2 + \arcsin\rho)T])$, where $P(R > t) = e^{-t^2/2}$, $T$ has a non uniform convex density function $f_T(t) = 2 - 4/\{\pi(1 + (2t-1)^2)\}$ on $[0,1]$, and $\rho = 0.5$.

Case (a) is an example of the standard elliptical case, for which estimation results already exist (see Abdous et al. (2008)), whereas (b) and (c) illustrate the situation where the density level lines are "asymptotically elliptic" (see Fougères and Soulier (2010)). In these three cases, $\Psi$ is the Normal distribution function (denoted by $\Phi$), and Assumption 6 is fulfilled with $\beta = 2$. Figure 1 illustrates the estimation of $\Psi$ via the nonparametric estimator $\hat\Psi_n$ defined by (30) for one sample ($n = 1000$, $k = 100$) of distribution (b).




Figure 1: Estimation of $\Psi$ via the nonparametric estimator $\hat\Psi_n$ for one sample ($n = 1000$, $k = 100$) of distribution (b).
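A sample from distribution (b) can be drawn by inverse-cdf transforms, as sketched below. The sketch assumes that $R$ and $T$ are independent, which the radial representation suggests but this section does not state; the function names are hypothetical. The cdf of $T$ is $F_T(t) = 1/2 + (2/\pi)\arctan(2t-1)$, so $T = \{1 + \tan(\pi(U - 1/2)/2)\}/2$ for $U$ uniform on $[0,1]$, and $R = \sqrt{-2\log(1-V)}$ for $V$ uniform.

```python
import numpy as np

def sample_t_concave(u):
    """Inverse cdf of f_T(t) = 4 / {pi + pi (2t - 1)^2} on [0, 1]."""
    u = np.asarray(u, dtype=float)
    return 0.5 * (1.0 + np.tan(0.5 * np.pi * (u - 0.5)))

def simulate_case_b(n, rho=0.5, seed=None):
    """Draw n points (X, Y) from distribution (b); R and T independent (assumption)."""
    rng = np.random.default_rng(seed)
    r = np.sqrt(-2.0 * np.log(1.0 - rng.random(n)))  # P(R > t) = exp(-t^2 / 2)
    t = sample_t_concave(rng.random(n))
    alpha = np.arcsin(rho)
    x = r * np.cos((np.pi / 2 + alpha) * t - alpha)
    y = r * np.sin((np.pi / 2 + alpha) * t)
    return x, y

x, y = simulate_case_b(1000, rho=0.5, seed=0)
```

Both angles stay in ranges where cosine and sine are nonnegative, so the simulated points lie in the positive quadrant, as required in the text.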

The Kolmogorov-Smirnov goodness-of-fit test performed here therefore has test statistic

$$T_{KS} = \sup_{y \in \mathbb{R}} \sqrt{k}\, |\hat\Psi(y) - \Phi(y)| . \tag{47}$$

As shown in Section 4.1, $T_{KS}$ has asymptotically the same distribution as the random variable $Z$ defined in (34). Quantiles of this distribution have been obtained numerically and are listed in Table 1.



Table 1: Quantiles $q_\alpha$ of order $1 - \alpha$ of $Z$.

$\alpha$     0.01   0.05   0.10   0.15   0.20   0.25
$q_\alpha$   1.598  1.297  1.174  1.076  1.029  0.980
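The statistic (47) and the decision rule based on Table 1 can be sketched as follows. The sketch assumes that $\hat\Psi$ is the empirical distribution function of $k$ normalized values; the precise definition (30) lies outside this section, so this form and the function names are illustrative assumptions only.

```python
from math import sqrt
from statistics import NormalDist

# Quantiles of Z taken from Table 1 (order 1 - alpha).
TABLE1 = {0.01: 1.598, 0.05: 1.297, 0.10: 1.174,
          0.15: 1.076, 0.20: 1.029, 0.25: 0.980}

def ks_statistic(z):
    """sqrt(k) * sup_y |F_k(y) - Phi(y)|, with F_k the empirical cdf of z."""
    k = len(z)
    phi = NormalDist().cdf
    d = 0.0
    for j, v in enumerate(sorted(z), start=1):
        f = phi(v)
        # The supremum is attained just before or at a jump of F_k.
        d = max(d, j / k - f, f - (j - 1) / k)
    return sqrt(k) * d

def rejects(t_ks, alpha):
    """True if the observed statistic exceeds the Table 1 quantile q_alpha."""
    return t_ks > TABLE1[alpha]
```

For instance, an observed value of 1.3 would be rejected at level 0.05 (1.3 > 1.297) but not at level 0.01.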



We have compared these theoretical levels to the empirical levels obtained by simulation. In the three cases (a) to (c), 1000 samples of size $n = 10^3$, $10^4$ and $10^5$ are simulated. The $k$ observations having the largest first component are kept, for three different values of $k$, and the nonparametric estimate $\hat\Psi$ given in (30) is computed with this reduced sample. The observed values of the test statistic $T_{KS}$ are compared to the quantiles listed in Table 1. For brevity, we present only the results corresponding to the two theoretical levels $\alpha = 0.05$ and $\alpha = 0.1$. These empirical levels are shown in Table 2.

A common feature for the three distributions is that the results are rather sensitive to the reduced number of observations $k$. However, the value of $k$ leading to the best agreement between empirical and theoretical levels is rather stable in most cases studied ($k = 100$ in two thirds of the cases).

Table 2: Empirical levels $(\hat\alpha_{0.05}, \hat\alpha_{0.1})$ associated to theoretical levels (0.05, 0.1) for the goodness-of-fit test with statistic $T_{KS}$. The original sample size is denoted by $n$, and the number of observations used for the estimation is denoted by $k$. Notation (a)-(c) refers to the three bivariate distributions listed above. The boldface characters point out the best result in each case.

n        k     (a)              (b)              (c)
1000     50    (0.053, 0.095)   (0.031, 0.066)   (0.027, 0.050)
1000     100   (0.140, 0.231)   (0.055, 0.102)   (0.04, 0.085)
1000     150   (0.327, 0.453)   (0.071, 0.147)   (0.077, 0.153)
10000    50    (0.059, 0.095)   (0.03, 0.061)    (0.028, 0.045)
10000    100   (0.052, 0.099)   (0.038, 0.07)    (0.038, 0.088)
10000    200   (0.101, 0.183)   (0.054, 0.096)   (0.065, 0.125)
100000   100   (0.051, 0.082)   (0.037, 0.075)   (0.044, 0.071)
100000   200   (0.080, 0.133)   (0.041, 0.087)   (0.0795, 0.128)
100000   500   (0.140, 0.257)   (0.05, 0.103)    (0.20, 0.298)

5.2 Semi-parametric estimation of the conditional quantile function 

Assume that Assumptions 1, 4, 6 and equation (35) hold and that the limiting distribution $\Psi$ is the standard Gaussian distribution $\Phi$ and . The small sample behavior of the semi-parametric estimator $\hat y_n(p)$ of the quantile function $\theta^{\leftarrow}(x_n,p)$ defined by Equation (46) is illustrated in Figure 2 for the three distributions presented in Section 5.1. In each case, 100 samples of size 10000 are simulated. A proportion of 1% of the observations is used, which are the 100 observations with largest first component. For each sample, the conditional quantile function $\theta^{\leftarrow}(x,p)$ is estimated for two values of $x$ corresponding to the theoretical $X$-quantiles of order $1 - \epsilon$, where $\epsilon = 10^{-4}$ and $\epsilon = 10^{-5}$. Figure 2 summarizes the quality of these estimations by showing the median, and the 2.5%- and 97.5%-quantiles of $\hat y_n(p)$ for the two fixed values of $x$ specified above.
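The summary described above (median and 2.5%- and 97.5%-quantiles over the simulated samples, for each probability p) can be computed along the following lines. The array layout, one row per simulated sample and one column per value of p, and the function name are assumptions of this sketch.

```python
import numpy as np

def summarize_replications(estimates):
    """Pointwise 2.5%, 50% and 97.5% quantiles over replications.

    estimates: array of shape (n_replications, n_probabilities), where entry
    (r, j) is the estimated conditional quantile for sample r and the j-th p.
    """
    est = np.asarray(estimates, dtype=float)
    lower, median, upper = np.quantile(est, [0.025, 0.5, 0.975], axis=0)
    return lower, median, upper
```

Each returned array has one entry per value of p, which corresponds to one dashed, solid and dashed curve of a panel of Figure 2.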

The estimation results are globally good, and the best ones are obtained for cases (a) 
and (c), see rows 1 and 3 of Figure 2. Besides, one can observe a slight improvement as the 
conditioning event becomes more extreme. 

These empirical confidence intervals compare well with those obtained by applying the central limit theorem of Corollary 16. We do not show them in Figure 2 for the sake of clarity.

6 Data analysis 

To illustrate the use of the new procedures, and more specifically the Kolmogorov-Smirnov goodness-of-fit test proposed in Section 4.1, the hypothesis $\Psi = \Phi$, where $\Phi$ is the standard Gaussian cdf, is tested using the series of monthly returns for the 3M stock and the Dow Jones Industrial Average from January 1970 to January 2008 ($n = 457$ values). These data were used by Levy and Duchin (2004) and revisited by Abdous et al. (2008). In the latter paper, the hypothesis of bivariate ellipticity was accepted through a test of elliptical symmetry proposed by Huffer and Park (2007) and the contagion from the Dow Jones to the 3M stock was tested. As shown in Abdous et al. (2005), ellipticity implies that Condition (1) holds and that the limiting distribution is the Gaussian law. The present procedure allows testing for the Gaussian conditional limit law without assuming ellipticity, but only the weaker assumption (1).

The observed values of the test statistic $T_{KS}$ defined by (47) for different choices of the threshold $k$ (or equivalently of the proportion $r$ of observations used, $k = nr$) are summarized in Table 3. According to Table 1, all these observed values correspond to a p-value greater than 0.25, which leads to accepting the hypothesis $\Psi = \Phi$.

Table 3: Observed values $t_{KS}$ of the test statistic $T_{KS}$ defined by (47) in terms of the proportion $r$ or number $k$ of observations used.

r          0.05   0.10   0.15   0.20
k          22     45     68     91
$t_{KS}$   0.842  0.847  0.777  0.948
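The conclusion can be checked directly from the tabulated values: every observed statistic in Table 3 lies below the quantile $q_{0.25} = 0.980$ of Table 1. The short sketch below also checks that the tabulated $k$ coincide with $nr$ rounded down, which is an observation about the table rather than a rule stated in the text; the function names are hypothetical.

```python
from math import floor

N_OBS = 457                 # number of monthly returns
Q_025 = 0.980               # quantile of order 0.75 of Z, from Table 1
TABLE3 = {0.05: (22, 0.842), 0.10: (45, 0.847),
          0.15: (68, 0.777), 0.20: (91, 0.948)}   # r -> (k, t_KS)

def counts_match(n, table):
    """Check that each tabulated k equals floor(n * r)."""
    return all(floor(n * r) == k for r, (k, _) in table.items())

def all_below(table, q):
    """Check that every observed statistic is below the quantile q."""
    return all(t < q for _, t in table.values())

consistent_k = counts_match(N_OBS, TABLE3)
p_value_above_025 = all_below(TABLE3, Q_025)
```

Since the largest observed value, 0.948, is still below 0.980, the test does not reject at any of the levels listed in Table 1.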



7 Proofs 



Proof of Proposition 3. By Proposition 2, the weak convergence of $\tilde\nu_n$ to $\nu$ implies that for any compact set $K$ of $(-1/\gamma,\infty) \times (-\infty,\infty)$ such that $\nu(\partial K) = 0$ and any function $h$, it holds that

$$\lim_{n\to\infty} \iint_K h(x,y)\, \tilde\nu_n(dx,dy) = \iint_K h(x,y)\, \nu(dx,dy) \quad \text{in probability.}$$

For e,M > 0, e < I/7, define K = [— e, M) x [— M, M] and K c = [-e, 00) x (-00, oo)\K. Let h 
be a nonnegative function on [— e, 00) x (— 00, 00) such that h(x, y) < C(|x|Vl) 9 _1 (|y|Vl) p . 
We must prove that 



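$$\lim_{M\to\infty} \limsup_{n\to\infty} \iint_{K^c} h(x,y)\, \tilde\nu_n(dx,dy) = 0 , \tag{48}$$

in probability. Since

$$E\left[ \iint_{K^c} h(x,y)\, \tilde\nu_n(dx,dy) \right] = \iint_{K^c} h(x,y)\, \nu_{n/k}(dx,dy) ,$$

Assumption 2 implies that

$$\lim_{M\to\infty} \limsup_{n\to\infty} E\left[ \iint_{K^c} h(x,y)\, \tilde\nu_n(dx,dy) \right] = \lim_{M\to\infty} \limsup_{n\to\infty} \iint_{K^c} h(x,y)\, \nu_{n/k}(dx,dy) = \lim_{M\to\infty} \iint_{K^c} h(x,y)\, \nu(dx,dy) = 0 .$$

This yields (48) and concludes the proof of Proposition 3.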



□

Proof of Proposition 5. Write

$$\frac{\hat m(X_{(n:n-k)}) - m \circ b(n/k)}{a \circ b(n/k)} = \frac{S_n}{T_n}$$

with

$$S_n = \frac{1}{k} \sum_{i=1}^{k} \frac{Y_{[n:n-i+1]} - m \circ b(n/k)}{a \circ b(n/k)} \cdot \frac{X_{(n:n-i+1)} - X_{(n:n-k)}}{\psi \circ b(n/k)} , \qquad T_n = \frac{1}{k} \sum_{i=1}^{k} \frac{X_{(n:n-i+1)} - X_{(n:n-k)}}{\psi \circ b(n/k)} .$$

We have already seen that $T_n$ converges weakly to $1/(1-\gamma)$. Recall that we have defined

$$x_n = \frac{X_{(n:n-k)} - b(n/k)}{\psi \circ b(n/k)} .$$

By definition of $\tilde\nu_n$, we have (with $x_+ = \sup(x,0)$ for any real number $x$)

$$S_n = \int_{x_n}^{\infty} \int_{-\infty}^{\infty} (x - x_n)\, y\, \tilde\nu_n(dx,dy) = \int_0^{\infty} \int_{-\infty}^{\infty} xy\, \tilde\nu_n(dx,dy) - \int_0^{x_n} \int_{-\infty}^{\infty} xy\, \tilde\nu_n(dx,dy) - x_n \int_{x_n}^{\infty} \int_{-\infty}^{\infty} y\, \tilde\nu_n(dx,dy) . \tag{49}$$

By Proposition 3, the first term in (49) converges to $\mu/(1-\gamma)$. Under Assumption 1, it is well known that $x_n = o_P(1)$; cf. De Haan and Ferreira (2006, Theorem 2.2.1). This and Assumption 2 imply that the last two terms in (49) are $o_P(1)$. Thus $S_n$ converges weakly to $\mu/(1-\gamma)$ by Proposition 3. If $m(x) = \rho x$, then $\hat\rho - \rho \sim X_{(n:n-k)}^{-1}\, a(X_{(n:n-k)})\, \mu$, which converges to 0 if $a(x) = o(x)$ or if $\mu = 0$ and $a(x) = O(x)$. □

Proof of Proposition 6. We show that a 2 (X^ n:n _ k ^)/a 2 o b(n/k) converges weakly to 1. Recall 
that £ n = {in(Xr n . n _ k \) — mo b(n/k)}/a o b(n/k). By Proposition 5, £ n = op(l), and noting 
that v n {[x n ,oo] x [-co, 00]} = 1, where v n and x n are respectively defined by (4) and (22), 
we have 



l ( X {n:n-k)) 1 A [ F; - m O 6(n/fc) 



a 2 ob(n/k) k 4^f \ aob(n/k) J {V°f>(n/*o 

/•OO /'OO 

= / / (y- ^n) 2 Z>n(dx,d?/) 

/"CO pOO /"OO /"OO 

/ / V 2 v n {dx,dy) -2£ n I / j/ z>„(dx, dy) + & 

J Xn, J — OO J Xn J —OO 

y 2 i> n (dx,dy) + o P (l) . 

Thus a(X( n . n _ k ))/a o b(n/k) converges weakly to 1 by Proposition 3 and equation (7). □ 



Proof of Proposition 8. We start by proving the convergence of the finite dimensional distributions of $W_n$. Denote $\tilde G_n(x,y) = \tilde\nu_n((x,\infty) \times (-\infty,y])$, $G(x,y) = \nu((x,\infty) \times (-\infty,y])$, $\tilde X_i = \{X_i - b(n/k)\}/\psi \circ b(n/k)$, $\tilde Y_i = \{Y_i - m \circ b(n/k)\}/a \circ b(n/k)$ and

$$\xi_{n,i}(x,y) = k^{-1/2} \left\{ 1_{\{\tilde X_i > x,\, \tilde Y_i \le y\}} - P(\tilde X_i > x,\, \tilde Y_i \le y) \right\} = k^{-1/2} \left\{ 1_{\{\tilde X_i > x,\, \tilde Y_i \le y\}} - k n^{-1} G_{n/k}(x,y) \right\} .$$

Then for each $n$, the random variables $\xi_{n,i}$, $1 \le i \le n$, are i.i.d.,

$$\mathrm{cov}(\xi_{n,i}(x,y), \xi_{n,i}(x',y')) = \frac{1}{n} G_{n/k}(x \vee x', y \wedge y') - \frac{k}{n^2} G_{n/k}(x,y)\, G_{n/k}(x',y') ,$$

and

$$W_n(x,y) = \sum_{i=1}^{n} \xi_{n,i}(x,y) + k^{1/2} \{ G_{n/k}(x,y) - G(x,y) \} .$$

Assumption 3 and (19) imply that $k^{1/2}(G_{n/k} - G)$ converges to zero locally uniformly. The Lindeberg central limit theorem (cf. Araujo and Giné (1980)) and (19) yield the convergence of the finite dimensional distributions of $\sum_{i=1}^{n} \xi_{n,i}(x,y)$ to the Gaussian process with autocovariance defined by (20). Tightness can be obtained as in Einmahl et al. (1993) by using an exponential inequality such as Inequality 1 in the aforementioned reference.

We now prove the second part of Proposition 8. Let $h$ be a $C^\infty$ function with compact support in $(-1/\gamma,\infty) \times (-\infty,\infty)$. The weak convergence of $W_n$ in $\mathcal{D}((-1/\gamma,\infty) \times (-\infty,\infty))$ implies that $\iint h(x,y) W_n(x,y)\, dx\, dy$ converges weakly to $\iint h(x,y) W(x,y)\, dx\, dy$. Thus, by integration by parts, it also holds that $\iint h(x,y)\, W_n(dx,dy)$ converges weakly to $\iint h(x,y)\, W(dx,dy)$. Let $\epsilon \in (0, 1/\gamma)$ and define $A = [-\epsilon,\infty) \times (-\infty,\infty)$. Let $g$ be a measurable function defined on $A$ such that $|g(x,y)|^2 \le C(|x| \vee 1)^{p^\dagger} (|y| \vee 1)^{q^\dagger}$. Then, for all $\epsilon > 0$, there exists a $C^\infty$ function $h$ with compact support in $A$ such that $\int_A (g-h)^2\, d\nu \le \epsilon$. Then,

$$\int_A g\, d\tilde\mu_n = \int_A h\, d\tilde\mu_n + \int_A (g-h)\, d\tilde\mu_n .$$

The first term in the right hand side converges weakly to $W(h)$ and we prove now that the second one converges in probability to 0. Denote $u = g - h$ and

$$\mu_n = k^{1/2} \{ \nu_{n/k} - \nu \} .$$

Then,

$$\int_A u\, d\tilde\mu_n = k^{-1/2} \sum_{i=1}^{n} \{ u(\tilde X_i, \tilde Y_i) - E[u(\tilde X_i, \tilde Y_i)] \} + \int_A u\, d\mu_n .$$

By definition, for any function $v$, $E[v(\tilde X_i, \tilde Y_i)] = k n^{-1} \int v\, d\nu_{n/k}$, thus

$$E\left[ \left( \int_A u\, d\tilde\mu_n \right)^2 \right] \le \int_A u^2\, d\nu_{n/k} + \left| \int_A u\, d\mu_n \right|^2 .$$

By assumption on $g$, and since $h$ has compact support, it also holds that $u^2(x,y) \le C(|x| \vee 1)^{p^\dagger} (|y| \vee 1)^{q^\dagger}$. Thus, by Assumption 3 and (19), it holds that $\lim_{n\to\infty} \int_A u\, d\mu_n = 0$ and $\lim_{n\to\infty} \int_A u^2\, d\nu_{n/k} = \int_A u^2\, d\nu$. Thus

$$\limsup_{n\to\infty} E\left[ \left( \int_A u\, d\tilde\mu_n \right)^2 \right] \le \int_A u^2\, d\nu \le \epsilon .$$

Taking into account that $\mathrm{var}(W(g) - W(h)) = \mathrm{var}(W(g-h)) = \int_A (g-h)^2\, d\nu \le \epsilon$, we conclude that $W_n(g)$ converges weakly to $W(g)$. □

Proof of Corollary 9. We prove separately the claimed limit distributions. The joint conver- 
gence is obvious. We start with x n , defined in (22). Denote G ra (x) = v n ((x, oo) x (— oo, +oo)). 
By Proposition 8, k l l 2 (G n - P y ) converges weakly in T> to the process BoP 7 , where B is 
a standard Brownian motion on [0,1]. By Vervaat's Lemma (De Haan and Ferreira, 2006, 
Lemma A. 0.2), k l l 2 {G^ —P^~} jointly converges weakly in V to — (P^~)'B. Since GJ^(l) = x n , 
-Pp(l) = and (P^T)'(l) = —1, we get the claimed limit distribution for k x l 2 x n . 

We now consider £ n , defined in (15). By definition, 

. _ J2i=l{ X (n:n-i+l) ~ X( n:n _ k )}{Y[ n . n _ i+1 ] - mo b(n/k)} 

kip o b(n/k)a o b(n/k) 

J2i=l{X(n:n-i+l) ~ X (n:n~k)} 

kip o b(n/k) 

= !Z I-oc( X ~ Xn)yi>n{dx, dy) 

Since fj, = by assumption, we obtain 

JT /^o( x ~ x n )yfin(dx, dy) 



k ^ Cn 



JT f-oo( x - Xn)£>n(dx,dy) 



)Xn J— oo 

Since x n = Op(k~ 1 / 2 ), it is easily seen that 

k y 2 _ Jo" JZo ^An(dx, dy) + Op(l) 

Io° I-oc xD n (dx, dy) + o P (l) 

Applying Propositions 3 and 8, we obtain that k l / 2 ^ n converges weakly to (1 — r ))W{gi t \). 
Consider now d(A( n:n _ fc )). As in the proof of Proposition 6, we write 



a2 (X( n : n -k)) 



oo roo roo roo 



a 2 o (n/k) 

and since x n = Op(k~ 1 ^ 2 ) and £ n = Op(k~ 1 ^ 2 ), we get 

j2f V , , "\ I poo /■oo 



y v n (dx, dy) - 2£ n / / yi> n (dx, dy) + £ n , 



,1/2 I ° (^(n:n-fc)) 

I a 2 o b(n/k) 



y fl n (dx,dy) + o P (l) 



Proposition 8 and the delta method yield that k 1 ^ 2 {a(X^ n:n _^)/a o b{n/k) — 1} converges 
weakly to 5^(50,2)- □ 
Proof of Proposition 15. The asymptotic normality of $\hat\beta_k$ is proved (under more general conditions) in Gardes and Girard (2006, Corollary 1). Consider now $\hat a_{k_1}(x_n)$. By (35) and (37),

$$a(x) = a(X_{(n:n-k_1)}) \left( \frac{x}{X_{(n:n-k_1)}} \right)^{1-\beta/2} ,$$

thus, by (39), we obtain

$$\frac{\hat a_{k_1}(x_n)}{a(x_n)} = \frac{\hat a_{k_1}(X_{(n:n-k_1)})}{a(X_{(n:n-k_1)})}\, X_{(n:n-k_1)}^{(\hat\beta_k - \beta)/2}\, x_n^{(\beta - \hat\beta_k)/2} .$$

Decomposing further, we get

$$\frac{\hat a_{k_1}(x_n)}{a(x_n)} - 1 = \left\{ \frac{\hat a_{k_1}(X_{(n:n-k_1)})}{a(X_{(n:n-k_1)})} - 1 \right\} X_{(n:n-k_1)}^{(\hat\beta_k - \beta)/2}\, x_n^{(\beta - \hat\beta_k)/2} \tag{50}$$

$$+ \left\{ X_{(n:n-k_1)}^{(\hat\beta_k - \beta)/2} - 1 \right\} \left\{ x_n^{(\beta - \hat\beta_k)/2} - 1 \right\} . \tag{51}$$

Since $\hat\beta_k - \beta = O_P(k^{-1/2})$, $\log(x_n) = o(k^{1/2})$ and $k/k_1 \to 0$, we obtain

$$x_n^{(\beta - \hat\beta_k)/2} - 1 \sim (\beta - \hat\beta_k) \log(x_n)/2 ,$$

$$X_{(n:n-k_1)}^{(\hat\beta_k - \beta)/2} - 1 \sim (\hat\beta_k - \beta) \log(X_{(n:n-k_1)})/2 \sim (\hat\beta_k - \beta) \log(b(n/k_1))/2 ,$$

where the equivalence relations above hold in probability. Thus, by the first part of Proposition 15 and (43), the product in (51) is $O_P(k^{-1} \log^2(x_n)) = o_P(k^{-1/2} \log(x_n))$ by (44). By Theorem 14, $\hat a_{k_1}(X_{(n:n-k_1)})/a(X_{(n:n-k_1)}) - 1 = O_P(k_1^{-1/2})$, thus the term in the right hand side of (50) is $O_P(k_1^{-1/2}) = o_P(k^{-1/2} \log(x_n))$ since $k/k_1 \to 0$. Altogether, these bounds yield

$$\frac{k^{1/2}}{\log(x_n)} \left\{ \frac{\hat a_{k_1}(x_n)}{a(x_n)} - 1 \right\} = k^{1/2} (\beta - \hat\beta_k) + o_P(1) ,$$

and the proof follows from the asymptotic normality of $k^{1/2}(\beta - \hat\beta_k)$. □
Proof of Corollary 16. Define $\tilde y_n = \rho x_n + a(x_n) \Psi^{-1}(p)$. Then

$$\hat y_n - y_n = \hat y_n - \tilde y_n + \tilde y_n - y_n = (\hat\rho_{k_1} - \rho) x_n + (\hat a_{k_1}(x_n) - a(x_n)) \Psi^{-1}(p) + \tilde y_n - y_n .$$

In order to study $\tilde y_n - y_n$, denote $z_n = (y_n - \rho x_n)/a(x_n)$. Then $\lim_{n\to\infty} z_n = \Psi^{-1}(p)$. Indeed, if the sequence $z_n$ is unbounded, then it tends to infinity at least along a subsequence. Choose $z > \Psi^{-1}(p)$. Then, for large enough $n$,

$$p = P(Y \le \rho x_n + a(x_n) z_n \mid X > x_n) \ge P(Y \le \rho x_n + a(x_n) z \mid X > x_n) \to \Psi(z) > p .$$

Thus the sequence $z_n$ is bounded, and if it converges to $z$ (along a subsequence), it necessarily holds that $\Psi(z) = p$, thus $z_n$ converges to $\Psi^{-1}(p)$. Since we have assumed that $a(x) = o(x)$, this implies that $y_n \sim \rho x_n$ and

$$\frac{\tilde y_n - y_n}{y_n} \sim \frac{a(x_n) \{ \Psi^{-1}(p) - z_n \}}{\rho x_n} \to 0 .$$

Moreover, since $\Psi' \circ \Psi^{-1}(p) > 0$, by a first order Taylor expansion, we have

$$\Psi(z_n) - p = \Psi'(\xi_n) \{ z_n - \Psi^{-1}(p) \} ,$$

where $\xi_n = \Psi^{-1}(p) + u \{ z_n - \Psi^{-1}(p) \}$ for some $u \in (0,1)$. By Assumption 3, $\| \theta(x_n, \rho x_n + a(x_n) \cdot) - \Psi \|_\infty = O(c \circ b(n))$. Since we have already shown that $z_n$ converges to $\Psi^{-1}(p)$, $1/\Psi'(\xi_n)$ is bounded for large enough $n$, so $\Psi^{-1}(p) - z_n = O(c \circ b(n))$. Thus, by (41) (with c instead of c), we get

$$\frac{k^{1/2} x_n}{\log(x_n)\, a(x_n)} \cdot \frac{\tilde y_n - y_n}{y_n} = O\left( \frac{k^{1/2}\, c \circ b(n)}{\log(x_n)} \right) = o\left( \frac{k_1^{1/2}\, c \circ b(n)}{\log(x_n)} \right) = o(1) .$$

Next, by definition, and since $y_n \sim \rho x_n$ and $a(x_n) = o(x_n)$, we have

$$\frac{\hat y_n - \tilde y_n}{y_n} \sim \frac{\hat\rho_{k_1} - \rho}{\rho} + \frac{a(x_n) \Psi^{-1}(p)}{\rho x_n} \left\{ \frac{\hat a_{k_1}(x_n)}{a(x_n)} - 1 \right\} .$$

Thus,

$$\frac{k^{1/2} x_n}{\log(x_n)\, a(x_n)} \cdot \frac{\hat y_n - \tilde y_n}{y_n} \sim \frac{k^{1/2} x_n (\hat\rho_{k_1} - \rho)}{\rho\, a(x_n) \log(x_n)} + \frac{\Psi^{-1}(p)}{\rho} \cdot \frac{k^{1/2}}{\log(x_n)} \left\{ \frac{\hat a_{k_1}(x_n)}{a(x_n)} - 1 \right\} .$$

The first term in the right-hand side tends to zero by Theorem 14 and the assumptions on the sequences $k_1$, $k$ and $x_n$. The second term converges weakly to a centered Gaussian law with variance $\{\Psi^{-1}(p)/(\rho\beta)\}^2$ by Proposition 15. In the case $\Psi^{-1}(p) = 0$, the main term is the first one in the right-hand side of the last display, and we conclude by applying Theorem 14. □
Figure 2: Median (solid line), 2.5%- and 97.5%-quantiles (dashed lines) of the estimated conditional quantile function $\hat y = \hat\theta^{\leftarrow}(x,p)$ defined in (46) and theoretical conditional quantile function $y$ (dotted line) as a function of the probability $p \in (0,1)$. Each row (from 1 to 3) corresponds to a distribution (from (a) to (c)) as described in Section 5.1. Each column refers to a different value of $x$, respectively corresponding to the theoretical $X$-quantiles of order $1 - \epsilon$, where $\epsilon = 10^{-4}$ and $\epsilon = 10^{-5}$.